Cross-entropy clustering

Authors

  • Jacek Tabor
  • Przemyslaw Spurek
Abstract

We build a general and widely applicable clustering theory, called cross-entropy clustering (CEC for short), that combines the advantages of classical k-means (easy implementation and speed) with those of EM (affine invariance and the ability to adapt to clusters of desired shapes). Moreover, unlike k-means and EM, CEC finds the optimal number of clusters by automatically removing groups which carry no information. Although CEC, like EM, can be built on an arbitrary family of densities, in the most important case of Gaussian CEC the division into clusters is affine invariant, while the numerical complexity is comparable to that of k-means.

Similar resources

Entropy-based Consensus for Distributed Data Clustering

The increasingly larger scale of available data and the more restrictive concerns on their privacy are some of the challenging aspects of data mining today. In this paper, Entropy-based Consensus on Cluster Centers (EC3) is introduced for clustering in distributed systems with a consideration for confidentiality of data; i.e. it is the negotiations among local cluster centers that are used in t...

Introduction to Cross-Entropy Clustering: The R Package CEC

The R package CEC (Kamieniecki and Spurek, 2014) performs clustering based on the cross-entropy clustering (CEC) method, which was recently developed with the use of information theory. The main advantage of CEC is that it combines the speed and simplicity of k-means with the ability to use various Gaussian mixture models and reduce unnecessary clusters. In this work we present a practical tutor...

Detection of Elliptical Shapes via Cross-Entropy Clustering

The problem of finding elliptical shapes in an image will be considered. We discuss the new solution which uses cross-entropy clustering, providing the theoretical background of this approach. The proposed algorithm allows search for ellipses with predefined sizes and position in the space. Moreover, it works well in higher dimensions.

Combined Forecasting of Rainfall Based on Fuzzy Clustering and Cross Entropy

Rainfall is an essential index to measure drought, and it is dependent upon various parameters including geographical environment, air temperature and pressure. The nonlinear nature of climatic variables leads to problems such as poor accuracy and instability in traditional forecasting methods. In this paper, the combined forecasting method based on data mining technology and cross entropy is p...

Application of the cross-entropy method to clustering and vector quantization

We apply the cross-entropy (CE) method to problems in clustering and vector quantization. The CE algorithm involves the following iterative steps: (a) the generation of clusters according to a certain parametric probability distribution, (b) updating the parameters of this distribution according to the Kullback-Leibler cross-entropy. Through various numerical experiments we demonstrate the high...

Journal:
  • Pattern Recognition

Volume 47

Publication date 2014